Voice activation using prosodic features

نویسندگان

  • Marco Khne
  • Matthias Wolff
  • Matthias Eichner
  • Rüdiger Hoffmann
چکیده

In this paper we propose a voice activation method based on prosodic keyword verification. In current voice activation systems features like the fundamental frequency contour have not been considered so far. Normally a continuous listening word spotter is used to detect a certain predefined keyword. We conducted an experiment which shows that people emphasize this keyword when they address a recognizer. To capture the prosodic information we trained an HMM on the fundamental frequency and energy contour of the keyword. The prosodic model is used to verify the keyword hypotheses of a phonetic recognizer. We investigated the performance of the prosodic model to distinguish between the same keyword spoken in command and non-command phrases. The introduction of the prosodic information significantly reduced the false alarm rate whereas the detection rate was only slightly degraded.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The effect of bilateral subthalamic nucleus deep brain stimulation (STN-DBS) on the acoustic and prosodic features in patients with Parkinson’s disease: A study protocol for the first trial on Iranian patients

Background: The effect of subthalamic nucleus deep brain stimulation (STN-DBS) on the voice features in Parkinson’s disease (PD) is controversial. No study has evaluated the voice features of PD underwent STN-DBS by the acoustic, perceptual, and patient-based assessments comprehensively. Furthermore, there is no study to investigate prosodic features before and after DBS in PD. The curren...

متن کامل

Using Prosodic and Voice Quality Features for Paralinguistic Information Extraction

The use of voice quality features in addition to prosodic features is proposed for automatic extraction of paralinguistic information (like speech acts, attitudes and emotions) in dialog speech. Perceptual experiments and acoustic analysis are conducted for monosyllabic utterances spoken in several speaking styles, carrying a variety of paralinguistic information. Acoustic parameters related wi...

متن کامل

Using voice-quality measurements with prosodic and spectral features for speaker diarization

Jitter and shimmer voice-quality measurements have been successfully used to detect voice pathologies and classify different speaking styles. In this paper, we investigate the usefulness of jitter and shimmer voice measurements in the framework of the speaker diarization task. The combination of jitter and shimmer voice-quality features with the long-term prosodic and shortterm spectral feature...

متن کامل

On the fusion of prosody, voice spectrum and face features for multimodal person verification

Multimodal person recognition systems normally use shortterm spectral features as voice information. In this paper prosodic information is added to a system based on face and voice spectrum features. By using two fusion techniques, support vector machines and matcher weighting, different fusion strategies based on the fusion of monomodal scores in several steps are proposed. The performance of ...

متن کامل

On the relation between voice source parameters and prosodic features in connected speech

The behaviour of the voice source characteristics in connected speech was studied. Voice source parameters were obtained by automatic inverse filtering, followed by automatic fitting of a glottal waveform model to the data. Consistent relations between voice source parameters and prosodic features were observed.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004